Improved implementation and experimental evaluation of the max-error optimized wavelet synopses
Authors
Abstract
This paper provides an improved implementation of an algorithm for building wavelet synopses for max-error metrics, recently introduced by Garofalakis and Kumar (GK) [4]. Given a storage space of size M, the GK algorithm finds a wavelet synopsis of size M that minimizes the maximum (absolute or relative) error, measured over the data values, among all wavelet synopses of size M. The running time of the GK algorithm is O(NM log M) and its space complexity is O(NM). We improve both the time and space complexities by a factor of M, reducing the running time to O(N log M) and the space requirement to O(N). Since no experimental results were reported in [4], we present an experimental comparison of the accuracy of the GK synopsis with that of other wavelet synopses, as well as an experimental comparison of the running time of the original GK algorithm with that of our improved implementation. We also apply the GK synopsis to range queries, building it over the raw data as well as over the prefix sums of the data, and compare it experimentally with other wavelet synopses, demonstrating an interesting similarity to another synopsis that can be computed in linear time.
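To make the max-error metrics concrete, the following is a minimal sketch and not the GK algorithm or the improved implementation described above. It computes an unnormalized Haar decomposition, builds a conventional greedy synopsis that keeps the M coefficients largest in normalized magnitude, and measures the resulting maximum absolute and maximum relative error. The function names, the power-of-two input length, and the `sanity` bound used to guard the relative error against very small data values are all assumptions made for illustration.

```python
import numpy as np

def haar_transform(data):
    """Unnormalized Haar decomposition (input length assumed a power of two).
    Returns [overall average, detail coefficients from coarsest to finest]."""
    a = np.asarray(data, dtype=float)
    details = []
    while len(a) > 1:
        avg = (a[0::2] + a[1::2]) / 2
        diff = (a[0::2] - a[1::2]) / 2
        details.insert(0, diff)  # coarser levels end up first
        a = avg
    return np.concatenate([a] + details)

def haar_inverse(coeffs):
    """Reconstruct the data vector from the coefficient layout above."""
    c = np.asarray(coeffs, dtype=float)
    a = c[:1]
    pos = 1
    while pos < len(c):
        d = c[pos:pos + len(a)]
        nxt = np.empty(2 * len(a))
        nxt[0::2] = a + d
        nxt[1::2] = a - d
        pos += len(a)
        a = nxt
    return a

def greedy_synopsis(coeffs, m):
    """Keep the m coefficients with the largest normalized magnitude
    (the usual L2-oriented heuristic); zero out the rest."""
    c = np.asarray(coeffs, dtype=float)
    n = len(c)
    # Support size of each basis vector: n for the average and the coarsest
    # detail, halving at every finer level.
    support = np.array([n] + [n >> (i.bit_length() - 1) for i in range(1, n)])
    keep = np.argsort(-np.abs(c) * np.sqrt(support), kind="stable")[:m]
    out = np.zeros_like(c)
    out[keep] = c[keep]
    return out

def max_errors(data, approx, sanity=1.0):
    """Maximum absolute error and maximum relative error over the data values.
    `sanity` is an assumed lower bound on the relative-error denominator."""
    data = np.asarray(data, dtype=float)
    abs_err = np.abs(data - np.asarray(approx, dtype=float))
    rel_err = abs_err / np.maximum(np.abs(data), sanity)
    return abs_err.max(), rel_err.max()

data = [2, 2, 0, 2, 3, 5, 4, 4]
coeffs = haar_transform(data)
synopsis = greedy_synopsis(coeffs, m=4)
max_abs, max_rel = max_errors(data, haar_inverse(synopsis))
print(max_abs, max_rel)
```

A max-error optimized synopsis would instead choose the M coefficients so that `max_abs` (or `max_rel`) is as small as possible, which the greedy selection above does not guarantee.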
Similar resources
Workload-Based Wavelet Synopses
This paper introduces workload-based wavelet synopses, which exploit query workload information to significantly boost accuracy in approximate query processing. We show that wavelet synopses can adapt effectively to workload information, and that they have significant advantages over previous approaches. An important aspect of our approach is optimizing synopses constructions toward error metri...
A Fast Approximation Scheme for Probabilistic Wavelet Synopses
Several studies have demonstrated the effectiveness of Haar wavelets in reducing large amounts of data down to compact wavelet synopses that can be used to obtain fast, accurate approximate query answers. While Haar wavelets were originally designed for minimizing the overall root-mean-squared (i.e., L2-norm) error in the data approximation, the recently-proposed idea of probabilistic wavelet s...
Building Data Synopses Within a Known Maximum Error Bound
The constructions of Haar wavelet synopses for large data sets have proven to be useful tools for data approximation. Recently, research on constructing wavelet synopses with a guaranteed maximum error has gained attention. Two relevant problems have been proposed: One is the size bounded problem that requires the construction of a synopsis of a given size to minimize the maximum error. Another...
Efficient Haar+ Synopsis Construction for the Maximum Absolute Error Measure
Several wavelet synopsis construction algorithms were previously proposed based on dynamic programming for unrestricted Haar wavelet synopses as well as Haar synopses. However, they find an optimal synopsis for every incoming value in each node of a coefficient tree, even if different incoming values share an identical optimal synopsis. To alleviate the limitation, we present novel algorithms, whi...
On the Optimality of the Greedy Heuristic in Wavelet Synopses for Range Queries
In recent years, wavelet-based synopses were shown to be effective for approximate queries in database systems. The simplest wavelet synopses are constructed by computing the Haar transform over a vector consisting of either the raw data or the prefix sums of the data, and using a greedy heuristic to select the wavelet coefficients that are kept in the synopsis. The greedy heuristic is known to ...
Journal title:
Volume / issue:
Pages: -
Publication date: 2005